Opioid Environment Toolkit

4 Link Community Data

4.1 Overview

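This tutorial links community-level data to the zip code boundaries used in earlier tutorials. We join COVID-19 case data by zip code to the Chicago zip code boundary file, then map the result with the methadone clinic locations and their buffer area overlaid. Our objectives are to:

  • load spatial and tabular (CSV) data,
  • merge the tabular data to the spatial data using a shared zip code identifier, and
  • map the merged data together with the resource locations.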

4.2 Environment Setup

To replicate the code and functions illustrated in this tutorial, you’ll need to have R and RStudio downloaded and installed on your system. This tutorial assumes some familiarity with the R programming language.

4.2.1 Packages used

We will use the following packages in this tutorial:

  • sf: to manipulate spatial data
  • tmap: to visualize and create maps
  • units: to convert units within spatial data

4.2.2 Required Inputs and Expected Outputs

Our inputs will be:

  • a SHP zip code boundary file (“chicago_zips.shp”),
  • a CSV file with COVID case data from the city data portal,
  • a SHP file with the locations of our resources (“methadoneClinics.shp”), and
  • a SHP file with the clinic buffer area (“mclinicarea.shp”).

We will join the COVID case data to the zip code boundaries using the zip code as the key, then map the cumulative case rate with the clinic buffer area and clinic locations overlaid. Our final result will be a merged spatial dataset with the COVID variables attached to each zip code, plus a map of it.
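Before loading anything, it can help to confirm that the input files sit where the code expects them. The sketch below is a minimal check in base R. The helper name `check_inputs` is hypothetical, and the paths are copied from the read calls used later in this tutorial, so adjust them to your own folder layout.

```r
# Hypothetical helper: report which expected input files are present.
check_inputs <- function(paths) {
  data.frame(path = paths, found = file.exists(paths), stringsAsFactors = FALSE)
}

# Paths as they appear in the read calls of this tutorial.
inputs <- c(
  "data/chicago_zips.shp",
  "data/COVID-19_Cases__Tests__and_Deaths_by_ZIP_Code.csv",
  "methadoneClinics.shp",
  "data/mclinicarea.shp"
)

status <- check_inputs(inputs)
print(status)
```

Any row with `found` equal to `FALSE` points to a file that needs to be moved or a path that needs to be edited before continuing.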

4.2.3 Load the packages

Load the libraries for use.

library(sf)
library(tmap)
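
If `library()` fails, the package is probably not installed yet. As a rough sketch (not part of the original workflow), the following base R snippet reports which of the two packages loaded above are missing. The variable names are illustrative only.

```r
# Packages this tutorial loads with library().
needed <- c("sf", "tmap")

# requireNamespace() returns FALSE (rather than erroring) when a package is absent.
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Install first: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
} else {
  message("All packages are available.")
}
```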

4.3 Load data

First we’ll load the updated zip code dataset from a previous tutorial, along with the methadone clinic locations.

chicago_zips <- read_sf("data/chicago_zips.shp")
str(chicago_zips)
## tibble [85 × 10] (S3: sf/tbl_df/tbl/data.frame)
##  $ ZCTA5CE10 : chr [1:85] "60501" "60007" "60651" "60652" ...
##  $ GEOID10   : chr [1:85] "60501" "60007" "60651" "60652" ...
##  $ CLASSFP10 : chr [1:85] "B5" "B5" "B5" "B5" ...
##  $ MTFCC10   : chr [1:85] "G6350" "G6350" "G6350" "G6350" ...
##  $ FUNCSTAT10: chr [1:85] "S" "S" "S" "S" ...
##  $ ALAND10   : chr [1:85] "12532295" "36493383" "9052862" "12987857" ...
##  $ AWATER10  : chr [1:85] "974360" "917560" "0" "0" ...
##  $ INTPTLAT10: chr [1:85] "+41.7802209" "+42.0086000" "+41.9020934" "+41.7479319" ...
##  $ INTPTLON10: chr [1:85] "-087.8232440" "-087.9973398" "-087.7408565" "-087.7147951" ...
##  $ geometry  :sfc_MULTIPOLYGON of length 85; first list element: List of 1
##   ..$ :List of 1
##   .. ..$ : num [1:671, 1:2] -87.9 -87.9 -87.9 -87.9 -87.9 ...
##   ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA
##   ..- attr(*, "names")= chr [1:9] "ZCTA5CE10" "GEOID10" "CLASSFP10" "MTFCC10" ...
meth_sf <- st_read("methadoneClinics.shp")
## Reading layer `methadoneClinics' from data source `/Users/maryniakolak/code/opioid-environment-toolkit/methadoneClinics.shp' using driver `ESRI Shapefile'
## Simple feature collection with 27 features and 8 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -87.7349 ymin: 41.68698 xmax: -87.57673 ymax: 41.96475
## CRS:            4326

Next, we’ll load some new data we’re interested in joining: Chicago COVID-19 Cases, Tests, and Deaths by ZIP Code, available on the city data portal. We’ll load the CSV and inspect the data:

COVID <- read.csv("data/COVID-19_Cases__Tests__and_Deaths_by_ZIP_Code.csv")
str(COVID)
## 'data.frame':    1860 obs. of  21 variables:
##  $ ZIP.Code                            : Factor w/ 60 levels "60601","60602",..: 3 4 11 11 15 3 3 3 3 3 ...
##  $ Week.Number                         : int  39 39 16 15 11 10 11 12 13 14 ...
##  $ Week.Start                          : Factor w/ 31 levels "03/01/2020","03/08/2020",..: 30 30 7 6 2 1 2 3 4 5 ...
##  $ Week.End                            : Factor w/ 31 levels "03/07/2020","03/14/2020",..: 30 30 7 6 2 1 2 3 4 5 ...
##  $ Cases...Weekly                      : int  0 0 8 7 NA NA NA NA NA NA ...
##  $ Cases...Cumulative                  : int  13 31 72 64 NA NA NA NA NA NA ...
##  $ Case.Rate...Weekly                  : int  0 0 25 22 NA NA NA NA NA NA ...
##  $ Case.Rate...Cumulative              : num  1107 3964 222 197 NA ...
##  $ Tests...Weekly                      : int  25 12 101 59 6 0 0 1 3 4 ...
##  $ Tests...Cumulative                  : int  327 339 450 349 9 0 0 1 4 8 ...
##  $ Test.Rate...Weekly                  : int  2130 1534 312 182 14 0 0 85 256 341 ...
##  $ Test.Rate...Cumulative              : num  27853.5 43350.4 1387.8 1076.3 21.7 ...
##  $ Percent.Tested.Positive...Weekly    : num  0 0 0.1 0.1 NA NA NA NA NA NA ...
##  $ Percent.Tested.Positive...Cumulative: num  0 0.1 0.2 0.2 NA NA NA NA NA NA ...
##  $ Deaths...Weekly                     : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Deaths...Cumulative                 : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Death.Rate...Weekly                 : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Death.Rate...Cumulative             : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Population                          : int  1174 782 32426 32426 41563 1174 1174 1174 1174 1174 ...
##  $ Row.ID                              : Factor w/ 1860 levels "60601-10","60601-11",..: 92 123 317 316 436 63 64 65 66 67 ...
##  $ ZIP.Code.Location                   : Factor w/ 60 levels "","POINT (-87.556037 41.653147)",..: 13 15 9 9 5 13 13 13 13 13 ...
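
The output above shows ZIP.Code stored as a Factor, which is how read.csv treated text columns by default before R 4.0.0. Since the zip code will serve as a join key, one option is to read it as character from the start. The sketch below uses a small inline CSV with made-up values in place of the portal file. Only the column names mirror the real data.

```r
# Toy CSV text standing in for the portal download (values are made up).
csv_text <- "ZIP.Code,Week.Number,Cases...Weekly
60601,39,8
60601,38,5
60602,39,2"

# colClasses forces ZIP.Code to character regardless of R version.
covid_toy <- read.csv(text = csv_text, colClasses = c(ZIP.Code = "character"))

str(covid_toy)
table(covid_toy$ZIP.Code)  # rows per zip code: one per reported week
```

The table() call also makes the layout of the data visible: each zip code appears once per week, which matters for the merge in the next section.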

4.4 Clean & Merge Data

# Optional: keep only the columns of interest (uncomment to use)
#COVID1 <- COVID[, c("ZIP.Code", "Percent.Tested.Positive...Cumulative", "Population")]
#head(COVID1)
head(chicago_zips)
## Simple feature collection with 6 features and 9 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -88.06058 ymin: 41.73452 xmax: -87.58209 ymax: 42.04052
## CRS:            4326
## # A tibble: 6 x 10
##   ZCTA5CE10 GEOID10 CLASSFP10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 INTPTLON10
##   <chr>     <chr>   <chr>     <chr>   <chr>      <chr>   <chr>    <chr>      <chr>     
## 1 60501     60501   B5        G6350   S          125322… 974360   +41.78022… -087.8232…
## 2 60007     60007   B5        G6350   S          364933… 917560   +42.00860… -087.9973…
## 3 60651     60651   B5        G6350   S          9052862 0        +41.90209… -087.7408…
## 4 60652     60652   B5        G6350   S          129878… 0        +41.74793… -087.7147…
## 5 60653     60653   B5        G6350   S          6041418 1696670  +41.81996… -087.6059…
## 6 60654     60654   B5        G6350   S          1464813 113471   +41.89182… -087.6383…
## # … with 1 more variable: geometry <MULTIPOLYGON [°]>
# Create a character key that matches the GEOID10 field in chicago_zips
COVID$GEOID10 <- as.character(COVID$ZIP.Code)
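
By default, merge() keeps only rows whose key appears in both tables, so it is worth checking which keys fail to match before joining. The merged table later in this section has 1,798 rows against 1,860 in the COVID table, so some COVID rows have no matching boundary. The sketch below shows one way to list mismatches, using made-up key vectors in place of the real GEOID10 columns.

```r
# Toy keys standing in for chicago_zips$GEOID10 and COVID$GEOID10 (made up).
zip_keys   <- c("60601", "60602", "60007")
covid_keys <- c("60601", "60601", "60602", "Unknown")

# Keys in the COVID table with no matching boundary (dropped by merge()).
setdiff(covid_keys, zip_keys)

# Boundaries with no COVID records (also dropped unless all.x = TRUE).
setdiff(zip_keys, covid_keys)
```

With the real data, the same two calls would take `COVID$GEOID10` and `chicago_zips$GEOID10` as arguments.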

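Let’s merge the data using the zip code geographic identifier, the “GEOID10” field we just created from “ZIP Code”, to bring the COVID variables, including the cumulative case rate, into the zip code boundaries. Because the COVID table has one row per zip code per week, the merged object repeats each zip code polygon once for every week of data.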

new <- merge(chicago_zips, COVID, by = "GEOID10")
head(new)
## Simple feature collection with 6 features and 30 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -87.63396 ymin: 41.88083 xmax: -87.6129 ymax: 41.88893
## CRS:            4326
##   GEOID10 ZCTA5CE10 CLASSFP10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10  INTPTLAT10   INTPTLON10 ZIP.Code
## 1   60601     60601        B5   G6350          S  934226    60682 +41.8856419 -087.6215226    60601
## 2   60601     60601        B5   G6350          S  934226    60682 +41.8856419 -087.6215226    60601
## 3   60601     60601        B5   G6350          S  934226    60682 +41.8856419 -087.6215226    60601
## 4   60601     60601        B5   G6350          S  934226    60682 +41.8856419 -087.6215226    60601
## 5   60601     60601        B5   G6350          S  934226    60682 +41.8856419 -087.6215226    60601
## 6   60601     60601        B5   G6350          S  934226    60682 +41.8856419 -087.6215226    60601
##   Week.Number Week.Start   Week.End Cases...Weekly Cases...Cumulative Case.Rate...Weekly
## 1          39 09/20/2020 09/26/2020              8                213                 54
## 2          33 08/09/2020 08/15/2020              8                128                 54
## 3          34 08/16/2020 08/22/2020              7                135                 48
## 4          25 06/14/2020 06/20/2020              5                 82                 34
## 5          13 03/22/2020 03/28/2020              9                 23                 61
## 6          22 05/24/2020 05/30/2020              2                 70                 14
##   Case.Rate...Cumulative Tests...Weekly Tests...Cumulative Test.Rate...Weekly Test.Rate...Cumulative
## 1                 1451.4            202               4304               1376                29328.8
## 2                  872.2            216               2303               1472                15693.4
## 3                  919.9            240               2543               1635                17328.8
## 4                  558.8            100                881                681                 6003.4
## 5                  156.7             39                 79                266                  538.3
## 6                  477.0             92                617                627                 4204.4
##   Percent.Tested.Positive...Weekly Percent.Tested.Positive...Cumulative Deaths...Weekly
## 1                              0.0                                  0.0               1
## 2                              0.0                                  0.1               0
## 3                              0.0                                  0.1               0
## 4                              0.0                                  0.1               0
## 5                              0.2                                  0.3               0
## 6                              0.0                                  0.1               0
##   Deaths...Cumulative Death.Rate...Weekly Death.Rate...Cumulative Population   Row.ID
## 1                   6                 6.8                    40.9      14675 60601-39
## 2                   5                 0.0                    34.1      14675 60601-33
## 3                   5                 0.0                    34.1      14675 60601-34
## 4                   5                 0.0                    34.1      14675 60601-25
## 5                   0                 0.0                     0.0      14675 60601-13
## 6                   4                 0.0                    27.3      14675 60601-22
##              ZIP.Code.Location                       geometry
## 1 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 2 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 3 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 4 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 5 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 6 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
# Store the cumulative case rate under a shorter name for mapping
new$COVIDCaseRt <- new$Case.Rate...Cumulative
str(new)
## Classes 'sf' and 'data.frame':   1798 obs. of  32 variables:
##  $ GEOID10                             : chr  "60601" "60601" "60601" "60601" ...
##  $ ZCTA5CE10                           : chr  "60601" "60601" "60601" "60601" ...
##  $ CLASSFP10                           : chr  "B5" "B5" "B5" "B5" ...
##  $ MTFCC10                             : chr  "G6350" "G6350" "G6350" "G6350" ...
##  $ FUNCSTAT10                          : chr  "S" "S" "S" "S" ...
##  $ ALAND10                             : chr  "934226" "934226" "934226" "934226" ...
##  $ AWATER10                            : chr  "60682" "60682" "60682" "60682" ...
##  $ INTPTLAT10                          : chr  "+41.8856419" "+41.8856419" "+41.8856419" "+41.8856419" ...
##  $ INTPTLON10                          : chr  "-087.6215226" "-087.6215226" "-087.6215226" "-087.6215226" ...
##  $ ZIP.Code                            : Factor w/ 60 levels "60601","60602",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Week.Number                         : int  39 33 34 25 13 22 27 35 24 11 ...
##  $ Week.Start                          : Factor w/ 31 levels "03/01/2020","03/08/2020",..: 30 24 25 16 4 13 18 26 15 2 ...
##  $ Week.End                            : Factor w/ 31 levels "03/07/2020","03/14/2020",..: 30 24 25 16 4 13 18 26 15 2 ...
##  $ Cases...Weekly                      : int  8 8 7 5 9 2 3 13 6 NA ...
##  $ Cases...Cumulative                  : int  213 128 135 82 23 70 89 148 77 NA ...
##  $ Case.Rate...Weekly                  : int  54 54 48 34 61 14 20 89 41 NA ...
##  $ Case.Rate...Cumulative              : num  1451 872 920 559 157 ...
##  $ Tests...Weekly                      : int  202 216 240 100 39 92 132 254 93 6 ...
##  $ Tests...Cumulative                  : int  4304 2303 2543 881 79 617 1161 2797 781 7 ...
##  $ Test.Rate...Weekly                  : int  1376 1472 1635 681 266 627 900 1731 634 41 ...
##  $ Test.Rate...Cumulative              : num  29329 15693 17329 6003 538 ...
##  $ Percent.Tested.Positive...Weekly    : num  0 0 0 0 0.2 0 0 0.1 0.1 NA ...
##  $ Percent.Tested.Positive...Cumulative: num  0 0.1 0.1 0.1 0.3 0.1 0.1 0.1 0.1 NA ...
##  $ Deaths...Weekly                     : int  1 0 0 0 0 0 0 0 0 0 ...
##  $ Deaths...Cumulative                 : int  6 5 5 5 0 4 5 5 5 0 ...
##  $ Death.Rate...Weekly                 : num  6.8 0 0 0 0 0 0 0 0 0 ...
##  $ Death.Rate...Cumulative             : num  40.9 34.1 34.1 34.1 0 27.3 34.1 34.1 34.1 0 ...
##  $ Population                          : int  14675 14675 14675 14675 14675 14675 14675 14675 14675 14675 ...
##  $ Row.ID                              : Factor w/ 1860 levels "60601-10","60601-11",..: 30 24 25 16 4 13 18 26 15 2 ...
##  $ ZIP.Code.Location                   : Factor w/ 60 levels "","POINT (-87.556037 41.653147)",..: 11 11 11 11 11 11 11 11 11 11 ...
##  $ geometry                            :sfc_MULTIPOLYGON of length 1798; first list element: List of 1
##   ..$ :List of 1
##   .. ..$ : num [1:183, 1:2] -87.6 -87.6 -87.6 -87.6 -87.6 ...
##   ..- attr(*, "class")= chr  "XY" "MULTIPOLYGON" "sfg"
##  $ COVIDCaseRt                         : num  1451 872 920 559 157 ...
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr  "GEOID10" "ZCTA5CE10" "CLASSFP10" "MTFCC10" ...
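
The str(new) output above reports 1,798 rows, far more than the number of zip codes, because every zip code is repeated once per week. If one row per zip code is wanted, one option (not the approach taken in this tutorial) is to keep only each zip code's most recent week before merging. The sketch below illustrates this with base R on made-up toy tables. Only the column names mirror the real data.

```r
# Toy tables (made-up values): one row per zip, and one row per zip per week.
zips <- data.frame(GEOID10 = c("60601", "60602"), stringsAsFactors = FALSE)
covid <- data.frame(
  GEOID10     = c("60601", "60601", "60602", "60602"),
  Week.Number = c(38, 39, 38, 39),
  Case.Rate...Cumulative = c(1400.0, 1451.4, 900.0, 950.5),
  stringsAsFactors = FALSE
)

# Merging directly repeats each zip once per week.
nrow(merge(zips, covid, by = "GEOID10"))

# Keep only each zip's most recent week, then merge: one row per zip.
latest_week  <- ave(covid$Week.Number, covid$GEOID10, FUN = max)
covid_latest <- covid[covid$Week.Number == latest_week, ]
merged <- merge(zips, covid_latest, by = "GEOID10")
merged
```

Since the case rate is cumulative, the latest week carries the running total, which is why filtering on the maximum week is a reasonable choice here.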
# Load the clinic buffer area to overlay on the map
union.buffers <- st_read("data/mclinicarea.shp")
## Reading layer `mclinicarea' from data source `/Users/maryniakolak/code/opioid-environment-toolkit/data/mclinicarea.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: 1136699 ymin: 1818770 xmax: 1201240 ymax: 1941031
## CRS:            3435
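
Note that the buffer layer is reported in CRS 3435, while the zip code and clinic layers are in CRS 4326. If the layers need to share a CRS for analysis, sf's st_transform() can reproject them. The sketch below shows the call on a single made-up point rather than on the tutorial's layers.

```r
library(sf)

# One made-up point in downtown Chicago, in longitude/latitude (EPSG:4326).
pt_4326 <- st_sfc(st_point(c(-87.6298, 41.8781)), crs = 4326)

# Reproject to EPSG:3435, the CRS reported for the buffer layer.
pt_3435 <- st_transform(pt_4326, 3435)

st_crs(pt_3435)$epsg
st_coordinates(pt_3435)
```

The same call applies to a whole layer, for example `st_transform(meth_sf, 3435)`.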

4.5 Visualize Data

# Map the case rate by zip code, then overlay the buffer outline and clinic points
tm_shape(new) +
  tm_polygons("COVIDCaseRt", style = "jenks", palette = "BuPu",
              title = "COVID Case Rate") +
  tm_shape(union.buffers) + tm_borders(col = "blue") +
  tm_shape(meth_sf) + tm_dots(col = "black", size = 0.2)
## Warning in `$.crs`(gm$shape.master_crs, "proj4string"): CRS uses proj4string, which is deprecated.
## Warning in `$.crs`(crs, "proj4string"): CRS uses proj4string, which is deprecated.